Convergence Proofs of Least Squares Policy Iteration Algorithm for High-Dimensional Infinite Horizon Markov Decision Process Problems, working paper (2008)

Convergence of Simulation-Based Policy Iteration

William Cooper
Shane Henderson
Mark Lewis

January 2003

Simulation-based policy iteration (SBPI) is a modification of the policy iteration algorithm for com...

Performance Bounds for λ-Policy Iteration and Application to the Game of Tetris, INRIA Lorraine Report

Bruno Scherrer
Inria Lorraine
Shie Mannor

January 2011

We consider the discrete-time infinite-horizon optimal control problem formalized by Markov decisio...

Multiagent value iteration algorithms in dynamic programming and reinforcement learning

Dimitri Bertsekas

December 2020

We consider infinite horizon dynamic programming problems, where the control at each stage consists ...

A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional markov decision process with continuous state and action spaces

Jun Ma
Warren B. Powell

January 2009

Abstract — In this paper, we present a recursive least squares approximate policy iteration (RLSAPI)...

Incremental least squares policy iteration for POMDPs

Hui Li
Xuejun Liao
Lawrence Carin

January 2015

We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding t...

Convergence Results for Some Temporal Difference Methods Based on Least Squares

Yu, Huizhen
Bertsekas, Dimitri P.

August 2008

We consider finite-state Markov decision processes, and prove convergence and rate of convergence re...

Modified Policy Iteration Algorithms for Discounted Markov Decision Problems

Martin L. Puterman
Moon Chirl Shin

In this paper we study a class of modified policy iteration algorithms for solving Markov decision p...

Tight performance bounds for approximate modified policy iteration with non-stationary policies

Boris Lesner
Bruno Scherrer
Inria Nancy Gr
Team Maia

January 2013

We consider approximate dynamic programming for the infinite-horizon stationary γ-discounted optimal...

Solution of MDPs using simulation-based value iteration

Abdulla, Mohammed Shahid
Bhatnagar, Shalabh

January 2005

This article proposes a three-timescale simulation based algorithm for solution of infinite horizon ...

Factored Value Iteration Converges (Acta Cybernetica)

January 2016

In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solu...

Non-stationary approximate modified policy iteration

Lesner, Boris
Scherrer, Bruno

July 2015

We consider the infinite-horizon γ-discounted optimal control problem formaliz...

Non-Stationary Approximate Modified Policy Iteration

Boris Lesner
Bruno Scherrer

October 2016

We consider the infinite-horizon γ-discounted optimal control problem formalized by Markov Decision ...

Improved bound on the worst case complexity of Policy Iteration

Hollanders, Romain
Gerencser, Balazs
Delvenne, Jean-Charles
Jungers, Raphaël M.

January 2016

Solving Markov Decision Processes is a recurrent task in engineering which can be performed efficien...

Approximate policy iteration for Markov decision processes via quantitative adaptive aggregations

Abate, A
Češka, M
Kwiatkowska, M

September 2016

We consider the problem of finding an optimal policy in a Markov decision process that maximises the...

Factored value iteration converges

Szita István
Lőrincz András

January 2008

In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solu...

